Study on Hadoop Cluster
Related resources
Benchmark Hadoop and Mars: MapReduce on cluster versus on GPU
MapReduce [5] is an emerging programming model that utilizes distributed processing elements (PE) on large datasets. With this model, programmers can write highly parallelized code without explicitly dealing with task scheduling and code parallelism in distributed systems. In this paper, we comparatively evaluate the performance of the MapReduce model on Hadoop [2] and on Mars [3]. Hadoop is a softwar...
A Sensor-Oriented Information System Based on Hadoop Cluster
To obtain real-time situational awareness of the specific behavior of targets of interest from huge-scale sensory datasets, this work presents a generic sensor-oriented information system based on a Hadoop cluster (SOIS-Hadoop). A NoSQL database is used to store and manage the heterogeneous sensory data; the Hadoop/MapReduce programming paradigm is employed to optimize the para...
A multi-agent simulation framework on small Hadoop cluster
In this paper, we explore the benefits and possibilities of implementing a multi-agent simulation framework on a Hadoop cloud. Scalability, fault tolerance, and failure recovery have always been a challenge for developers of distributed-systems applications. The highly efficient fault-tolerant nature of Hadoop, flexibility to include more systems on the fly, efficient load balancing and ...
Performance Evaluation of Apriori Algorithm on a Hadoop Cluster
Frequent itemset mining is a well-known concept in data science. If we feed frequent itemset mining algorithms large datasets, they quickly become resource hungry as their search space explodes. This problem is even more apparent when we try to use them on Big Data. Recent advances in parallel programming provide good solutions for dealing with large datasets, but they present their own problems w...
The Recovery System for Hadoop Cluster
Due to the brisk growth of data volume in many organizations, large-scale data processing has become a demanding topic for industry as well as academia. Hadoop is widely adopted in cloud computing environments for unstructured data. Hadoop is an open-source, Java-based distributed computing framework that supports large-scale distributed data processing. In recent years, Hadoop Distribu...
Journal
Journal title: IOSR Journal of Computer Engineering
Year: 2016
ISSN: 2278-8727,2278-0661
DOI: 10.9790/0661-1805018083